Iterative Improvement of Speaker Segmentation in a Noisy Environment Using High-Level Knowledge
Authors
Abstract
Our goal is to process the soundtrack of a sports game (tennis) to understand the progress of the game and, ultimately, to infer its rules. The chair umpire's speech is one of the most useful sources of information, and we focus on locating this signal in the soundtrack. Current audio segmentation techniques can work well on this task when the acoustics of the training and test data are well matched, but they fail when there is a mismatch, which occurs when the chair umpire, the microphone placement, the environmental noise, etc. differ between the test and training data. Our technique uses high-level knowledge of the syntax of the audio events (derived from the training data) to make a coarse estimate of the location of the umpire's speech. The data gathered from these locations is then iteratively refined by contrasting it with data believed to belong to another audio class (also gathered using the technique described above). A model built from the refined data enables a more accurate determination of the location of the speech segments. Our approach is applied to three different tennis games, each with a different umpire and different commentators. The results show that it reaches almost the same performance level as supervised methods, in which models for the speech are built using prior knowledge of its locations.
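The refinement loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not name the features or the model, so the single scalar feature per frame, the one-dimensional Gaussian per class, and the names `fit_gaussian`, `log_likelihood`, and `refine` are all assumptions made for illustration. The coarse estimate is stood in for by two hand-picked index ranges, one of which deliberately overlaps the wrong class.

```python
import math
from statistics import fmean, pvariance


def fit_gaussian(values, var_floor=1e-3):
    """Fit a 1-D Gaussian; the variance floor avoids division by zero."""
    return fmean(values), max(pvariance(values), var_floor)


def log_likelihood(x, model):
    """Log-density of x under a (mean, variance) Gaussian."""
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def refine(features, target_seed, other_seed, max_iter=20):
    """Iteratively relabel frames, starting from two non-empty coarse seed sets.

    Returns a list of booleans, True where a frame is assigned to the target class.
    """
    target_idx, other_idx = set(target_seed), set(other_seed)
    labels = None
    for _ in range(max_iter):
        if not target_idx or not other_idx:
            break  # one class vanished; keep the last labelling
        # Build one model per class from the frames currently assigned to it.
        target = fit_gaussian([features[i] for i in target_idx])
        other = fit_gaussian([features[i] for i in other_idx])
        # Contrast the two models on every frame.
        new_labels = [
            log_likelihood(x, target) > log_likelihood(x, other) for x in features
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        target_idx = {i for i, is_target in enumerate(labels) if is_target}
        other_idx = set(range(len(features))) - target_idx
    return labels


# Synthetic data (assumed): frames 40..59 carry the target class, with a
# deterministic pseudo-noise term in [-0.5, 0.5] added to every frame.
features = [
    (3.0 if 40 <= i < 60 else 0.0) + 0.1 * ((i * 7) % 11 - 5) for i in range(100)
]
# Coarse seeds (assumed): the target seed is shifted and partly wrong.
labels = refine(features, target_seed=range(35, 50), other_seed=range(0, 15))
found = [i for i, is_target in enumerate(labels) if is_target]
print(found[0], found[-1])
```

On this well-separated synthetic data the loop recovers the true segment even though a third of the initial target seed lies outside it. Real audio would need multi-dimensional features and a richer model, and convergence is not guaranteed in general.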
Similar resources
The effects of segmentation and redundancy methods on cognitive load and vocabulary learning and comprehension of English lessons in a multimedia learning environment
The present study examined the effects of segmentation and redundancy methods on cognitive load and on vocabulary learning and comprehension of English lessons in a multimedia learning environment. It is applied research with a true experimental design. The statistical population of the present study includes all people aged 14 to 16 who are enrolled in ...
High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation
Image segmentation is one of the most common steps in digital image processing. There are many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) employed for classifying a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to the noise, low contrast, and steep...
Minimizing Loss of Information at Competitive PLIP Algorithms for Image Segmentation with Noisy Back Ground
In this paper, two training systems for selecting PLIP parameters have been demonstrated. The first compares the MSE of a high precision result to that of a lower precision approximation in order to minimize loss of information. The second uses EMEE scores to maximize visual appeal and further reduce information loss. It was shown that, in the general case of basic addition, subtraction, or mul...
Audio Indexing Using Speaker Identification
In this paper, a technique for audio indexing based on speaker identification is proposed. When speakers are known a priori, a speaker index can be created in real time using the Viterbi algorithm to segment the audio into intervals from a single talker. Segmentation is performed using a hidden Markov model network consisting of interconnected speaker sub-networks. Speaker training data is used ...
Publication date: 2011